A Language-Independent Approach to Detection of Near-Miss Clones

Authors

  • Nikita Synytskyy
  • James R. Cordy
  • Thomas R. Dean
Abstract

Previous research shows that most software systems contain significant amounts of duplicated, or cloned, code. Some clones are exact duplicates of each other, while others differ in small details only. We designate these almost-perfect clones "near-miss" clones. Detection of near-miss clones has many benefits, both academic and practical. Finding these clones can give us better insight into the way developers maintain and reuse code; we can also try to remove near-miss clones in an effort to reduce overall source code size and decrease system complexity. This paper presents a simple and effective way to detect near-miss clones, and shows the results of its application to several pieces of software. We use lexical comparison tools coupled with language-specific extractors to locate clones. Our approach separates code comparisons from code understanding, and makes the comparisons language independent. This makes it easy to adapt to different programming languages.
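
The separation the abstract describes, a language-specific extractor feeding a language-independent lexical comparison, can be illustrated with a minimal sketch. This is not the authors' tool: the regex-based extractor, the use of Python's `difflib` as the comparison step, the function names, the sample input, and the 0.6 threshold are all assumptions chosen for illustration.

```python
import difflib
import re
from itertools import combinations

# Hypothetical sample input: two near-miss clones and one unrelated function.
SAMPLE = '''def area(w, h):
    result = w * h
    print(result)
    return result

def volume(w, h):
    result = w * h
    print(result)
    log(result)
    return result

def greet(name):
    return "hi " + name
'''


def extract_python_functions(source: str) -> list[str]:
    """Language-specific step (assumed): split source into top-level 'def' blocks."""
    blocks = re.split(r"(?m)^(?=def )", source)
    return [b.strip() for b in blocks if b.startswith("def ")]


def similarity(a: str, b: str) -> float:
    """Language-independent step: compare fragments line by line, ignoring indentation."""
    a_lines = [ln.strip() for ln in a.splitlines() if ln.strip()]
    b_lines = [ln.strip() for ln in b.splitlines() if ln.strip()]
    return difflib.SequenceMatcher(None, a_lines, b_lines).ratio()


def find_near_miss_clones(fragments: list[str], threshold: float = 0.6):
    """Return (i, j, score) for every fragment pair at or above the threshold."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(fragments), 2):
        score = similarity(a, b)
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


if __name__ == "__main__":
    for i, j, score in find_near_miss_clones(extract_python_functions(SAMPLE)):
        print(f"fragments {i} and {j} look like near-miss clones (score {score:.2f})")
```

Only `extract_python_functions` knows anything about the source language. Supporting another language in this sketch would mean swapping in a different extractor while leaving `similarity` and `find_near_miss_clones` untouched.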

Similar resources

Practical language-independent detection of near-miss clones

Previous research shows that most software systems contain significant amounts of duplicated, or cloned, code. Some clones are exact duplicates of each other, while others differ in small details only. We designate these almost-perfect clones as “near-miss” clones. While technically difficult, detection of near-miss clones has many benefits, both academic and practical. Finding these clones can...

Detection and Analysis of Near-Miss Clone Genealogies

It is believed that identical or similar code fragments in source code, also known as code clones, have an impact on software maintenance. A clone genealogy shows how a group of clone fragments evolve with the evolution of the associated software system, and thus may provide important insights on the maintenance implications of those clone fragments. Considering the importance of studying the e...

Near-miss function clones in open source software: an empirical study

The new hybrid clone detection tool NICAD combines the strengths and overcomes the limitations of both text-based and AST-based clone detection techniques and exploits novel applications of a source transformation system to yield highly accurate identification of cloned code in software systems. In this paper, we present an in-depth study of near-miss function clones in open source software usi...

Exact and Near-miss Clone Detection in Spreadsheets

Spreadsheets are used extensively in business, in many domains. The applicability of software engineering methods to spreadsheets has been a topic of research for several years [2], but the main focus has been on analyzing the formulas, and not on analyzing the data in the spreadsheets. One of the factors that plays a role in spreadsheet data quality is the occurrence of clones in the spreadshe...

Near-miss modeling: a segment-based approach to speech recognition

Currently, most approaches to speech recognition are frame-based in that they represent speech as a temporal sequence of feature vectors. Although these approaches have been successful, they cannot easily incorporate complex modeling strategies that may further improve speech recognition performance. In contrast, segment-based approaches represent speech as a temporal graph of feature vectors a...

Publication date: 2004